The purpose of this project is to investigate the provision of timely and effective emergency department care in Pennsylvania. The aims of the project are:
+ To compare emergency department access and ambulatory care metrics in Pennsylvania to that of other states and to national averages.
+ To compare emergency department access and ambulatory care metrics within Pennsylvania at the county level.
+ To investigate if there are any relationships between socioeconomic and demographic variables and access and care metrics using multivariate regressions.
Access to health care means having “the timely use of personal health services to achieve the best health outcomes” (IOM, 1993). Timely and effective ambulatory care is an important cornerstone of healthcare access and is essential for good patient outcomes and satisfaction. This is especially true in the US, where the cost of healthcare is high and ever increasing, but healthcare outcomes remain poor, especially for a developed nation. It is important that care not only be evidence-based and clinically indicated, but also timely and prompt, with delivery of the “right care at the right time.”
Ambulatory care includes primary care, outpatient care and emergency room care. Ideally, patients can use primary care and outpatient for usual care, and only visit the ED for emergent conditions. Unfortunately, given the cost of healthcare and differential access to quality insurance, many patients use the ED as a ready alternative to their usual source of medical care. As such, many US adults instead visit hospital EDs with problems that could have been managed appropriately in general primary care practice.
This added strain on emergency departments across the nation has led to issues of overcrowding, long wait times and significant strain on limited resources. This is not without its repercussions, as this thereby limits the quality and timeliness of care through the ED. The frequency of ED visits is thought particularly high among people who do not have a regular source for health care or who are uninsured.
Because people are substituting the ER for regular inpatient and outpatient treatments, we need to look specifically at the ER to get a better idea of what’s happening to these groups and if ED care is meeting these needs. Thus, in this interdisciplinary project, I attempt to examine metrics of timely and effective emergency department care and how they vary in Pennsylvania.
With respect to population health, Pennsylvania falls below national averages, ranking 28th among the 50 states in 2019 according to United Health Foundation’s America’s Health Rankings. PA ranked #28 for preventable hospitalizations. As in other states across the country, measures of health status in Pennsylvania vary by geographic and socioeconomic factors which, according to the AARP Pennsylvania in 2019, are combining to restrict access to health care services in Pennsylvania. According to AARP, these structural health disparities are particularly noted among those living in rural and low-resourced areas of the state and among URM communities (Black/African American and Latino), all of whom lack access to health care and culturally appropriate, evidence-based health promoting and chronic disease management practices.
In the EM landscape, there are numerous proxies for Timely and Effective Care (TEC).
+ The goal for patients with stroke should be a door-to-head CT time within 20 minutes in ≥50% of stroke patients who may be candidates for IV tPA or mechanical thrombectomy.
+ The goal for patients with ST-elevation myocardial infarction should be a door-to-needle time within 30 minutes (for thrombolysis) and a door-to-balloon time within 90 minutes (for PCI).
+ The goal for patients with ST-elevation myocardial infarction at a non-equipped hospital to be transferred to to a specialty hospital should be a door-in-door-out time within 30 minutes.
Additionally, given the wide publicity that long wait times has attracted it has attracted (on average, >2 hours across the US), ED waiting time is also an appropriate proxy.
This is an interdisciplinary issue, as it has implications for physicians, public health advocates, epidemiologists, economists, public service advocates, etc. As such, addressing the problem of timely access to care requires a multidisciplinary approach, as I have undertaken here.This project was supported by Dr. Gary Weissman, MD, MSHP (Assistant Professor of Medicine and Senior Fellow Leonard Davies Institute) and Dr. John Holmes PhD, FACE, FACMI (Professor of Medical Informatics in Epidemiology, Associate Director of the Institute for Biomedical Informatics), who both provided invaluable advice and guidance for mapping the data and assumptions for service areas.
Major data sources used in this project are listed below. These sources are all publicly available.
+ “TEC data”: Timely and Effective Care (TEC) data from CMS through 2020. Contains national, county-level and hospital-level data.
+ “Census data”: US Census Bureau data available in the tidycensus R package. Using acs5 survey 5-year data ending 2019.
First I investigated the TEC data: I visualized how TEC metrics vary in Pennsylvania as compared to other states regionally and nationally. Then, I visualized how TEC metrics. Then, I visualized variations of TEC metrics in Pennsylvania at the county-level using maps, highlighting the two most populous counties of Philadelphia County (with capital city of Philadelphia) and Alleghany County (with second major city of Pittsburgh).
Next, I investigated the socioeconomic and demographic data, visualizing variations by county in PA as they vary across PA by county.
Finally, I investigated the relationships between the TEC data and socioeconomic data using multivariate regression. In order to do this, I needed to overcome the problem that the TEC data is at the hospital level and the socioeconomic/demographic data is at the county level by converting both to zip code level data. For the TEC data, I used the hospital addresses calculate catchment/service areas around each hospital ED, assuming a 5-mile radius around hospitals based on zip code.
All necessary packages are installed and loaded, and necessary keys are registered.
## Loading required package: knitr
Functions necessary for later processing are created.
# Theme for maps
my_theme <- function() {
theme_minimal() +
theme(
axis.line = element_blank(),
axis.text = element_blank(),
axis.title = element_blank(),
panel.grid = element_line(color = "white"),
legend.key.size = unit(0.5, "cm"),
legend.text = element_text(size = 8),
legend.title = element_text(size = 10),
plot.title = element_text(size = 14),
#hjust = 0.5),
#panel.grid.major = element_line(color = "#ebebe5", size = 0.2),
#panel.grid.minor = element_blank(),
#plot.background = element_rect(fill = "#f5f5f2", color = NA),
#panel.background = element_rect(fill = "#f5f5f2", color = NA),
#legend.background = element_rect(fill = "#f5f5f2", color = NA),
panel.border = element_blank())
}
# Min for map scales
prev_min <- function(x) {
min(c(x),na.rm=TRUE)
}
# Max for map scales
prev_max <- function(x) {
max(c(x), na.rm=TRUE)
}
# Theme for bar graphs
my_theme_bar <- function() {
theme_minimal() +
theme(
axis.line = element_blank(),
panel.grid = element_line(color = "white"),
legend.key.size = unit(0.5, "cm"),
legend.text = element_text(size = 8),
legend.title = element_text(size = 10),
plot.title = element_text(size = 14),
panel.border = element_blank())
}
# Percent function
percent <- function(x, format = "f"){
paste0(formatC(x, format = format, digits = 2), "%")
}
# Function to extract legend for shared graphs
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)}Necessary inputs are downloaded and/or read.
+ TEC data is downloaded directly from the CMS website.
+ Counties map data from the tigris package.
+ 2019 County population data downloaded from US Census Bureau County Population Totals 2010-2019.
# TEC data
tec_hosp <- read.csv(file = "https://data.cms.gov/provider-data/sites/default/files/resources/350f34f9ef3d484925d49dfcce7a0f54_1632355520/Timely_and_Effective_Care-Hospital.csv")
tec_state <- read.csv(file = "https://data.cms.gov/provider-data/sites/default/files/resources/0b2db52e1ce72942f369cc00c00649f8_1632355520/Timely_and_Effective_Care-State.csv")
tec_natl <- read.csv(file = "https://data.cms.gov/provider-data/sites/default/files/resources/4766dbf7fd2de19618a27f5546954463_1632355520/Timely_and_Effective_Care-National.csv")
# PA counties data
pa_counties<-tigris::counties(state = "42", cb = TRUE, progress_bar = FALSE) #FIPS 42 is Pennsylvania
# PA counties population data
pa_pop<- readxl::read_xlsx("/Users/moniquearnold/Documents/PSOM/BMIN 503/Final Project/BMIN503_Final_Project/Data/co-est2019-annres-42.xlsx",range="A6:M72",
col_names = c("County","Census","Estimate Base",
"pop_2010",
"pop_2011",
"pop_2012",
"pop_2013",
"pop_2014",
"pop_2015",
"pop_2016",
"pop_2017",
"pop_2018",
"pop_2019"))
# Cleaning data
tec_state$NAME <- state.name[match(tec_state$State,state.abb)]
tec_state$Score<- as.numeric(tec_state$Score)
pa_pop$County<- gsub('^\\.|\\.$', '', pa_pop$County)
pa_pop$NAME<- gsub(" County[,%] Pennsylvania$","", pa_pop$County)First, national data is visualized using us_states dataset provided by the spData package. Note: ‘total_pop_15’ is the numerical vector of total population in 2015. For simplicity only the contiguous United States is considered.
Metrics investigated are:
+ OP_18b: Average (median) time patients spent in the ED
+ OP_18c: Average (median) time patients spent in the ED (Psychiatric/Mental Health Patients)
+ OP_2: Percentage of outpatients with chest pain or possible heart attack who got drugs to break up blood clots within 30 minutes of arrival
+ OP_3b: Percentage of patients who came to the ED with stroke symptoms who received brain scan results within 45 minutes of arrival
+ OP_23: Average (median) number of minutes before outpatients with chest pain or possible heart attack who needed specialized care were transferred to another hospital
# Select variables to investigate
varlist <- c("OP_18b","OP_18c","OP_2","OP_23","OP_3b")
state_vars <- tec_state %>%
dplyr::select(NAME,State,Measure.ID,Score) %>%
filter(Measure.ID %in% varlist) %>%
drop_na() %>%
reshape(idvar = c("State","NAME"), timevar = "Measure.ID", direction = "wide", varying=varlist)
# Merge to usa map data
usa <- us_states
to_map_natl<- merge(usa,state_vars, stringsAsFactors=FALSE)
str(to_map_natl) #check## Classes 'sf' and 'data.frame': 48 obs. of 13 variables:
## $ NAME : chr "Alabama" "Arizona" "Arkansas" "California" ...
## $ GEOID : chr "01" "04" "05" "06" ...
## $ REGION : Factor w/ 4 levels "Norteast","Midwest",..: 3 4 3 4 4 1 3 3 3 4 ...
## $ AREA : Units: [km^2] num 133709 295281 137690 409747 269573 ...
## $ total_pop_10: num 4712651 6246816 2872684 36637290 4887061 ...
## $ total_pop_15: num 4830620 6641928 2958208 38421464 5278906 ...
## $ State : chr "AL" "AZ" "AR" "CA" ...
## $ OP_18b : num 139 174 126 163 140 163 194 152 145 129 ...
## $ OP_18c : num 213 291 201 269 234 278 300 225 289 181 ...
## $ OP_2 : num 51 69 38 48 60 33 NA 54 39 33 ...
## $ OP_23 : num 62 72 73 73 73 75 68 75 70 62 ...
## $ OP_3b : num 74 71 62 78 56 64 NA 56 72 62 ...
## $ geometry :sfc_MULTIPOLYGON of length 48; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:51, 1:2] -88.2 -88.2 -87.4 -86.9 -85.6 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr [1:12] "NAME" "GEOID" "REGION" "AREA" ...
to_map_northeast<- to_map_natl %>% filter(REGION=="Norteast")
to_map_pa <- to_map_natl %>% filter(NAME=="Pennsylvania")
#CANNOT GET THIS TO WORK
# for (i in varlist){
# p <- ggplot() +
# geom_sf(data = to_map_natl, aes(fill = i), lwd=0, colour="white") +
# coord_sf() +
# my_theme() +
# scale_fill_viridis(discrete=FALSE, name = "Time (mins)", option = "mako", direction = -1,
# limit = range(prev_min(to_map_natl[[i]]), prev_max(to_map_natl[[i]]))) +
# labs(title= "Average (median) time patients spent in the ED",
# subtitle= "July - December 2020",
# caption = "Source: Centers for Disease Control") +
# theme(legend.position = "bottom")
# print(p)
# }
# Plot maps
#OP_18b
map_natl_OP_18b<-ggplot() +
geom_sf(data = to_map_natl, aes(fill = OP_18b), lwd=0, colour="white") +
geom_sf(data = to_map_pa, fill = NA, color = "red", size = 0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "Time (mins)", option = "mako", direction = -1,
limit = range(prev_min(to_map_natl$OP_18b), prev_max(to_map_natl$OP_18b))) +
labs(title= "Fig. 1: Average (median) time patients spent in the ED",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
theme(legend.justification = c(0, 1),legend.position = c(0, 0.35)) +
geom_text(aes(x = -85, y = 35, hjust="left",
label = "Pennslvania median = 156 mins \nUS national median = 148 mins"), stat = "unique")
# map_natl_OP_18b_NE<-ggplot(to_map_northeast) +
# geom_sf(aes(fill = OP_18b), lwd=0, colour="white") +
# geom_sf_label(aes(label = State)) +
# coord_sf() +
# my_theme() +
# scale_fill_viridis(discrete=FALSE, name = "Time (mins)", option = "mako", direction = -1,
# limit = range(prev_min(to_map_natl$OP_18b), prev_max(to_map_natl$OP_18b))) +
# labs(subtitle= "Northeast") +
# theme(legend.position = "none")
#
# lay <- rbind(c(1,1,1,NA),
# c(1,1,1,2),
# c(1,1,1,2))
# grid.arrange(map_natl_OP_18b, map_natl_OP_18b_NE, layout_matrix = lay)
map_natl_OP_18c<-ggplot() +
geom_sf(data = to_map_natl, aes(fill = OP_18c), lwd=0) +
geom_sf(data = to_map_pa, fill = NA, color = "red", size = 0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "Time (mins)", option = "mako", direction = -1,
limit = range(prev_min(to_map_natl$OP_18c), prev_max(to_map_natl$OP_18c))) +
labs(title= "Fig. 2: Average (median) time patients spent in the ED \n(Psychiatric/Mental Health Patients)",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
theme(legend.justification = c(0, 1),legend.position = c(0, 0.35)) +
geom_text(aes(x = -85, y = 35, hjust="left",
label = "Pennslyvania median = 263 mins \nUS national median = 248 mins"), stat = "unique")
map_natl_OP_2<-ggplot() +
geom_sf(data = to_map_natl, aes(fill = OP_2), lwd=0) +
geom_sf(data = to_map_pa, fill = NA, color = "red", size = 0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "% of patients", option = "magma", direction = -1,
limit = range(prev_min(to_map_natl$OP_2), prev_max(to_map_natl$OP_2))) +
labs(title= "Fig. 3: Percentage of outpatients with chest pain or possible heart \nattack who got drugs to break up blood clots within 30 minutes of arrival",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
theme(legend.justification = c(0, 1),legend.position = c(0, 0.35)) +
geom_text(aes(x = -85, y = 35, hjust="left",
label = "Pennslyvania median = 32% \nUS national median = 52%"), stat = "unique")
map_natl_OP_23<-ggplot() +
geom_sf(data = to_map_natl, aes(fill = OP_23), lwd=0) +
geom_sf(data = to_map_pa, fill = NA, color = "red", size = 0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "% of patients", option = "cividis", direction = 1,
limit = range(prev_min(to_map_natl$OP_23), prev_max(to_map_natl$OP_23))) +
labs(title= "Fig. 4: Percentage of patients who came to the ED with stroke symptoms \nwho received brain scan results within 45 minutes of arrival",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
theme(legend.justification = c(0, 1),legend.position = c(0, 0.35)) +
geom_text(aes(x = -85, y = 35, hjust="left",
label = "Pennslyvania median = 65% \nUS national median = 72%"), stat = "unique")
map_natl_OP_3b<-ggplot() +
geom_sf(data = to_map_natl, aes(fill = OP_3b), lwd=0) +
geom_sf(data = to_map_pa, fill = NA, color = "red", size = 0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "Time (mins)", option = "magma", direction = -1,
limit = range(prev_min(to_map_natl$OP_3b), prev_max(to_map_natl$OP_3b))) +
labs(title= "Fig. 5: Average (median) number of minutes before outpatients with chest pain \nor possible heart attack who needed specialized care were transferred to \nanother hospital",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
theme(legend.justification = c(0, 1),legend.position = c(0, 0.35)) +
geom_text(aes(x = -85, y = 35, hjust="left",
label = "Pennslyvania median = 76 mins \nUS national median = 61 mins"), stat = "unique")
map_natl_OP_18bmap_natl_OP_18cmap_natl_OP_2map_natl_OP_23map_natl_OP_3bNext, median metrics of Pennsylvania are compared to those of other states in the Northeast.
Note: Since only the final median values and not the underlying data, I cannot perform any statistical comparisons.
# Determine averages
fg_northeast<- as.data.frame(to_map_natl) %>%
filter(REGION=="Norteast") %>%
dplyr::select(NAME,OP_18b,OP_18c,OP_2,OP_23,OP_3b)
# Reshape for barchart
forgraph_northeast <- melt(setDT(fg_northeast), id.vars = c("NAME"), variable.name = "score")
# fg_ne_sum<-summarize_all(fg_northeast,mean,na.rm = TRUE) %>% mutate(NAME="Northeast States")
#
# fg_natl <- tec_natl %>%
# dplyr::select(Measure.ID,Score) %>%
# mutate(NAME="US National") %>%
# filter(Measure.ID %in% varlist) %>%
# reshape(idvar="NAME", timevar = "Measure.ID", direction = "wide", varying=varlist)
#forgraph_all <- rbind(fg_northeast,fg_ne_sum)
# Plot graphs
bar_OP_18b18c <- forgraph_northeast %>%
filter(variable=="OP_18b" | variable=="OP_18c") %>%
ggplot(aes(x=reorder(factor(NAME),value), y=value, fill=variable)) +
geom_bar(stat="identity", colour = "black", position = "stack") +
my_theme_bar() +
theme(axis.title.y=element_blank()) +
geom_text(aes(label=value), vjust=1, color="white", size=3.5) +
scale_y_continuous(name="Median Time (mins)") +
coord_flip() +
labs(title= "Fig. 6: Average (median) time patients spent in the ED",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
scale_fill_brewer(palette = "Set1",
breaks=c("OP_18b", "OP_18c"),
labels=c("All Patients", "Psychiatric/Mental Health Patients")) +
theme(legend.title=element_blank())
bar_OP_2_23 <- forgraph_northeast %>%
filter(variable=="OP_2" | variable=="OP_23") %>%
ggplot(aes(x=reorder(factor(NAME),value), y=value, fill=variable)) +
geom_bar(stat="identity", colour = "black", position = "stack") +
my_theme_bar() +
theme(axis.title.x=element_blank()) +
geom_text(aes(label=value), vjust=1, color="white", size=3.5) +
scale_y_continuous(name="% of patients") +
labs(title="Fig. 7: Percent of ED Patients Presenting with Acute Symptoms who \nReceived Timely Care ",
subtitle= "July - December 2020",
caption = "Source: Centers for Disease Control. Pennsylvania state is highlighted.") +
scale_fill_brewer(palette = "Accent",
breaks=c("OP_2", "OP_23"),
labels=c("% pts with chest pain or possible heart attack \nwho got drugs to break up blood clots within 30 minutes of arrival", "% pts with stroke symptoms who received \nbrain scan results within 45 minutes of arrival")) +
theme(legend.title=element_blank(), legend.position="bottom")
bar_OP_18b18cbar_OP_2_23A brief look at the wait times for the hospitals.
# Subset data to metrics of interest and PA
df_hosp <- tec_hosp %>% filter(Measure.ID %in% varlist & State=="PA")
## General hospitals
# Get list of general hospital names
df_hosp_fac <- df_hosp %>%
mutate(addr=paste0(Address,", ",City, ", ",State," ",ZIP.Code)) %>%
distinct(Facility.Name,addr,Measure.ID,Score) %>%
arrange(Facility.Name) %>%
reshape(idvar = c("Facility.Name","addr"), timevar = "Measure.ID", direction = "wide", varying=varlist)
df_hosp_fac$OP_18b <- as.numeric(df_hosp_fac$OP_18b)
df_hosp_fac$OP_18c <- as.numeric(df_hosp_fac$OP_18c)
df_hosp_fac$OP_2 <- as.numeric(df_hosp_fac$OP_2)
df_hosp_fac$OP_23 <- as.numeric(df_hosp_fac$OP_23)
df_hosp_fac$OP_3b <- as.numeric(df_hosp_fac$OP_3b)
df_hosp_fac$addr <- gsub('34TH ST &', '3400', df_hosp_fac$addr)
df_hosp_fac$addr <- gsub('34TH &', '3400', df_hosp_fac$addr)
# Format table
dt_pahosp <- df_hosp_fac %>% dplyr::select(Facility.Name,OP_18b,OP_18c) %>% filter(! is.na(OP_18b) | ! is.na(OP_18c))
dt_pahosp$Facility.Name<-str_to_title(dt_pahosp$Facility.Name)
dt_pahosp$OP_18b_time <- dt_pahosp$OP_18b %>% hms::as_hms() %>% as.POSIXlt() %>% format("%M:%S")
dt_pahosp$OP_18c_time <- dt_pahosp$OP_18c %>% hms::as_hms() %>% as.POSIXlt() %>% format("%M:%S")
dt_pahosp <- dt_pahosp %>% dplyr::rename('Hospital'=Facility.Name,
'Time Spent in ED - All Patients'=OP_18b_time,
'Time Spent in ED - Mental Health Patients'=OP_18c_time)
# Calculate means
dt_pahosp_means <- dt_pahosp %>% dplyr::summarize(mean_wait_all=mean(OP_18b, na.rm = TRUE),
mean_wait_bhv=mean(OP_18c, na.rm = TRUE))
dt_pahosp_means$`All Patients` <- dt_pahosp_means$mean_wait_all %>%
hms::as_hms() %>% as.POSIXlt() %>% format("%M:%S")
dt_pahosp_means$`Mental Health Patients` <- dt_pahosp_means$mean_wait_bhv %>%
hms::as_hms() %>% as.POSIXlt() %>% format("%M:%S")
dt_pahosp_means <- dt_pahosp_means %>% dplyr::select(-mean_wait_all,-mean_wait_bhv) %>% gather()
# Final tables
head(dt_pahosp[order(dt_pahosp$`Time Spent in ED - All Patients`,decreasing=TRUE),],n=10) %>%
dplyr::select(-OP_18b,-OP_18c) %>%
kable(caption="Table 1. Longest Patient Wait Times at PA Hospitals (Top 10)",
align="lcc",
row.names = FALSE) %>%
kable_styling(fixed_thead = T, bootstrap_options = c("hover", "condensed"), full_width = F)| Hospital | Time Spent in ED - All Patients | Time Spent in ED - Mental Health Patients |
|---|---|---|
| Jefferson Health- Northeast | 06:26 | NA |
| Milton S Hershey Medical Center | 04:54 | 08:24 |
| Hospital Of Univ Of Pennsylvania | 04:11 | 04:44 |
| Chester County Hospital | 04:05 | 10:08 |
| Penn Presbyterian Medical Center | 03:57 | 04:34 |
| Robert Packer Hospital | 03:44 | 06:07 |
| Upmc Memorial | 03:44 | 04:24 |
| Penn State Health St. Joseph | 03:42 | 04:53 |
| Main Line Hospital Lankenau | 03:38 | NA |
| Einstein Medical Center Montgomery | 03:37 | NA |
tail(dt_pahosp[order(dt_pahosp$`Time Spent in ED - All Patients`,decreasing=TRUE),],n=10) %>%
dplyr::select(-OP_18b,-OP_18c) %>%
kable(caption="Table 2. Shortest Patient Wait Times at PA Hospitals (Bottom 10)",
align="lcc",
row.names = FALSE) %>%
kable_styling(fixed_thead = T, bootstrap_options = c("hover", "condensed"), full_width = F)| Hospital | Time Spent in ED - All Patients | Time Spent in ED - Mental Health Patients |
|---|---|---|
| Millcreek Community Hospital | 01:36 | 03:28 |
| Ahn Emerus Westmoreland, Llc | 01:31 | NA |
| Penn Highlands Tyrone | 01:31 | NA |
| Highlands Hospital | 01:30 | 04:05 |
| Upmc Kane | 01:30 | 03:02 |
| Mercy Catholic Medical Center- Mercy Fitzgerald | 01:28 | 04:44 |
| Lecom Health Corry Memorial Hospital | 01:24 | NA |
| Grove City Medical Center | 01:22 | NA |
| Bucktail Medical Center | 01:19 | NA |
| Conemaugh Meyersdale Medical Center | 01:08 | NA |
dt_pahosp_means %>%
kable(caption="Table 3. Mean Patient Wait Times at PA Hospitals ",
align="lr",
row.names = FALSE,
col.names=NULL) %>%
kable_styling(fixed_thead = T, bootstrap_options = c("hover", "condensed"), full_width = F)| All Patients | 02:40 |
| Mental Health Patients | 05:03 |
Next, I explore the availability of hospitals across Pennsylvania. I will map hospital densities using the hospital data from TEC.I will map all hospitals, then hospitals across a few key service lines, namely: pediatric hospitals, trauma centers and stroke centers.
Using the package ggmap, I will get latitude and longitude data for each hospital address in the TEC data using the Google API. To do this, I registered for a free Google console account, generated an API key that I registered in R Studio, and enabled the following APIs on the Google Console:
+ Directions API
+ Geocoding API
+ Geolocation API
+ Places API
+ Maps Javascript AVI
+ Maps Embed API
I then converted the data frame of resultant cartesion coordinates to an sf object. I used the the projection WGS84, which is the CRS code #4326, a priori defined in the sf object. I mapped the hospitals onto a map of Pennsylvania using the counties dataset tigris library, overlayed with county population data.
## All hospitals
# getting map coordinates (lat and long) of hospital addresses
for(i in 1:nrow(df_hosp_fac))
{
ifelse(is.na(df_hosp_fac$addr[i]), NA,
hosp_ggmap_pa <- geocode(df_hosp_fac$addr[i],
output = "more",
source = "google",
messaging = FALSE))
df_hosp_fac$Longitude[i] <- as.numeric(hosp_ggmap_pa[1])
df_hosp_fac$Latitude[i] <- as.numeric(hosp_ggmap_pa[2])
}
df_hosp_fac_tib <- as_tibble(df_hosp_fac)
df_hosp_fac_sf <- st_as_sf(df_hosp_fac_tib, coords = c("Longitude", "Latitude"), crs = 4326)
mapview(df_hosp_fac_sf) #Quickly checking addresses geocoded correctly # merge to PA county data to PA population data
pa_tomap<-merge(pa_counties, pa_pop, by="NAME")
class(pa_tomap) #check## [1] "sf" "data.frame"
## Specialty hospitals
varlist_hosp <- c("Childrens","Stroke","Trauma")
for (var in varlist_hosp){
# Initialize table
tablet <- as.character(paste0("list_pa_",var))
# Import
list <- readxl::read_xlsx("/Users/moniquearnold/Documents/PSOM/BMIN 503/Final Project/BMIN503_Final_Project/Data/Specialty Hospitals.xlsx", sheet=var)
# Getting map coordinates (lat and long) of specialty hospital addresses
list2 <- cbind(list,
type_hosp=var,
geocode(list$Hospital,
output = "more",
source = "google",
messaging = FALSE))
list_tib <- as_tibble(list2)
list_sf <- st_as_sf(list_tib, coords = c("lon", "lat"), crs = 4326)
# Assign final table
assign(tablet,list_sf)
}
# checking addresses geocoded correctly
# mapview(list_pa_Childrens)
# mapview(list_pa_Stroke)
# mapview(list_pa_Trauma)
spec_sf <- st_as_sf(rbind.fill(list_pa_Childrens,list_pa_Stroke,list_pa_Trauma)) %>% dplyr::select(-'...3')
class(spec_sf)## [1] "sf" "data.frame"
# Creating annotations for graphs
high_pa_tomap <- pa_tomap %>% filter(pop_2019 > 1000000) #labelling counties with pop > 1 million
for(i in 1:nrow(high_pa_tomap)){
ifelse(is.na(high_pa_tomap$County[i]), NA,
high_pa_tomap2 <- geocode(high_pa_tomap$County[i], output = "latlon", source = "google", messaging = FALSE))
high_pa_tomap$lon[i] <- as.numeric(high_pa_tomap2[1])
high_pa_tomap$lat[i] <- as.numeric(high_pa_tomap2[2])
}
all_hosp_sf <- st_as_sf(rbind.fill(spec_sf,df_hosp_fac_sf) %>%
mutate(type_hosp = ifelse(is.na(type_hosp),"All Hospitals",type_hosp)))
# Maps
ggplot() +
geom_sf(data = pa_tomap, aes(fill = pop_2019/100000), color="white") +
geom_sf(data = df_hosp_fac_sf, size = 2, shape = 21, fill = "gold") +
geom_sf_label_repel(data = high_pa_tomap, aes(x=lon,y=lat,label=NAME),
size=3, force=7, nudge_y=-6, nudge_x=1, seed = 20,
alpha=0.8) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "Population (in 100,000s)", option = "mako", direction = -1,
limit = range(prev_min(pa_tomap$pop_2019/100000),
prev_max(pa_tomap$pop_2019/100000))) +
labs(title= "Fig. 8: Pennsylvania Specialty Service Line Hospitals",
caption = "Source: Centers for Disease Control, US Census Bureau (2019 Census).") +
theme(legend.justification = c(0, 1),legend.position = "bottom")ggplot() +
geom_sf(data = pa_tomap, aes(fill = pop_2019/100000), color="white") +
geom_sf(data = spec_sf, size = 2, shape = 21, fill = "gold") +
# geom_sf_label_repel(data = high_pa_tomap, aes(x=lon,y=lat,label=NAME),
# size=3, force=7, nudge_y=-6, nudge_x=1, seed = 20,
# alpha=0.8) +
facet_wrap(~type_hosp, shrink=FALSE, dir="v", nrow=1) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, name = "Population (in 100,000s)", option = "mako", direction = -1,
limit = range(prev_min(pa_tomap$pop_2019/100000),
prev_max(pa_tomap$pop_2019/100000))) +
labs(title= "Fig. 9: Pennsylvania Specialty Service Line Hospitals",
caption = "Source: Centers for Disease Control, US Census Bureau (2019 Census).") +
theme(legend.justification = c(0, 1),
legend.position = "bottom",
legend.text = element_text(size = 6))Next, I will investigate different key socioeconomic/demographic factors on average and by county in PA using the tidycensus package and 5 year ACS data ending in 2019. Key factors investigated are:
+ Poverty rate (DP03_0119PE)
+ Median household income (DP03_0062E)
+ Percentage population 25 years and over with high school grad or higher (DP02_0066PE)
+ Percentage population without health insurance coverage (DP03_0099PE)
+ Percentage population with public health insurance coverage (DP03_0098PE)
+ Percentage population with private health insurance (DP03_0097PE)
+ Percentage of population White (DP05_0037PE)
+ Percentage of population Asian (DP05_0044PE)
+ Percentage of population Black or African American (DP05_0038PE)
+ Percentage of population Hispanic or Latino (DP05_0078PE)
+ Percentage of population American Indian and Alaska Native (DP05_0039PE)
# Obtaining 2019 acs5 variable list to investigate
dp17 <- load_variables(2019, "acs5/profile", cache = TRUE)
# Obtaining COUNTY-LEVEL socioeconomic/demographic data
pa_factors <- get_acs(geography = "county",
variables = c(povertyrate = "DP03_0119P",
medincome = "DP03_0062",
edurate = "DP02_0067P",
unemploymentrate = "DP03_0009P",
nohealthins = "DP03_0099P",
pubhealthins = "DP03_0098P",
privhealthins = "DP03_0097P",
racewhite = "DP05_0077P",
raceasian = "DP05_0080P",
raceblack = "DP05_0078P",
raceamerind = "DP05_0079P",
racehawaii = "DP05_0081P",
racehispanic = "DP05_0071P",
raceother = "DP05_0082P"),
state = 42, #PA is 42
year = 2019,
geometry=TRUE)
unique(pa_factors$variable)## [1] "edurate" "unemploymentrate" "medincome" "privhealthins" "pubhealthins" "nohealthins" "povertyrate" "racehispanic" "racewhite" "raceblack" "raceamerind" "raceasian" "racehawaii" "raceother"
tableout <- data.frame(cbind(unique(pa_factors$variable),
c("Percentage of Population 25y+ with High Diploma or Equivalency",
"Unemployment Rate (%)",
"Median Household Income (in millions of USD)",
"Percentage of population with private health insurance",
"Percentage of population with public health insurance coverage",
"Percentage of population without health insurance coverage ",
"Poverty Rate (%)",
"Race -- % Hispanic or Latino",
"Race -- % White",
"Race -- % Black or African American",
"Race -- % American Indian and Alaska Native",
"Race -- % Asian",
"Race -- % Native Hawaiian and Other Pacific Islander",
"Race -- % Other Race")))
table_soc <- data.frame(pa_factors) %>%
group_by(variable) %>%
dplyr::select(NAME,variable, estimate) %>%
mutate(estimate = as.numeric(estimate),
variable = as.character(variable)) %>%
dplyr::summarize(mean_estimate=mean(estimate, na.rm = TRUE)) %>%
mutate(mean_estimate = ifelse(variable %in% c("medincome"),
scales::dollar(mean_estimate),
percent(mean_estimate))) %>%
left_join(tableout,by=c("variable"="X1")) %>%
dplyr::rename(`Socioeconomic/Demographic Factor`=X2,
`Mean Value`=mean_estimate)
table_soc %>% dplyr::select(`Socioeconomic/Demographic Factor`,`Mean Value`) %>%
kable(caption="Table 4. Socioeconomic and Demographic Factors Across PA Counties",
align="lr",
digits = 3,
col.names=c("Factor","Mean Value")) %>%
kable_styling(fixed_thead = T, bootstrap_options = c("hover", "condensed"), full_width = F)| Factor | Mean Value |
|---|---|
| Percentage of Population 25y+ with High Diploma or Equivalency | 89.89% |
| Median Household Income (in millions of USD) | $57,056.09 |
| Percentage of population without health insurance coverage | 5.91% |
| Poverty Rate (%) | 8.31% |
| Percentage of population with private health insurance | 72.23% |
| Percentage of population with public health insurance coverage | 39.20% |
| Race – % American Indian and Alaska Native | 0.10% |
| Race – % Asian | 1.50% |
| Race – % Black or African American | 4.44% |
| Race – % Native Hawaiian and Other Pacific Islander | 0.01% |
| Race – % Hispanic or Latino | 4.46% |
| Race – % Other Race | 0.09% |
| Race – % White | 87.87% |
| Unemployment Rate (%) | 5.04% |
Mapping socioeconomic/demographic variables.
# Mapping variables
tableout_econ <- tableout %>% filter(X1 %in% c("edurate","unemploymentrate","povertyrate","medincome"))
counter = 0
for(i in unique(pa_factors$variable)){
# dynamic figure labelling
counter <- counter + 1
figletter <- toupper(letters[counter])
# dynamic titling
tableout2 <- tableout_econ %>% filter(X1==i)
titleout <- as.character(paste0("Fig. 9",figletter,". Pennsylvania ",tableout2$X2," by County"))
# dynamic annotations
anno_soc <- table_soc %>% filter(variable==i)
meanlabel <-
paste0("Calculated Estimated Mean in PA: ",anno_soc$`Mean Value`)
# subset data for variable of interest
spec_table <- pa_factors %>% filter(variable==i)
var_pa_tomap <- spec_table %>%
filter(NAME=="Allegheny County, Pennsylvania" | NAME=="Philadelphia County, Pennsylvania") %>%
mutate(label=paste0(gsub(' County, Pennsylvania','',NAME),": ",estimate)) %>%
st_join(high_pa_tomap)
# final maps
demo_map <- ggplot() +
geom_sf(data = spec_table, aes(fill = estimate), color="white") +
# geom_sf(data = all_hosp_sf, size = 2, shape = 21, fill = "gold") +
geom_sf_label_repel(data = var_pa_tomap, aes(x=lon,y=lat,label=label),
size=3, force=7, nudge_y=-5, nudge_x=1, seed = 10,
fill = alpha(c("white"),0.8)) +
coord_sf() +
my_theme() +
scale_fill_viridis(discrete=FALSE, option = "rocket", direction = -1,
limit = range(prev_min(spec_table$estimate),
prev_max(spec_table$estimate))) +
labs(title = paste0(strwrap(titleout,width=50),collapse="\n"),
caption = "Source: US Census Bureau (2019 Census).",
subtitle = paste0(strwrap(meanlabel,width=50),collapse="\n")) +
theme(legend.justification = c(0, 1),
legend.position = "bottom")
# plot.tag.position = c(),
# plot.tag = element_text(vjust = 1, hjust = 1, size=8))
print(demo_map)
}Better visualizing racial makeup and health insurance variation by charts.
# Race
pa_factors_race <- pa_factors %>% data.frame() %>%
filter(grepl("race", variable, fixed = TRUE)) %>%
dplyr::select(NAME, variable, estimate) %>%
mutate(County = paste0(gsub(' County, Pennsylvania','',NAME)))
pa_factors_race_wide <- pa_factors_race %>%
dplyr::select(-NAME) %>%
reshape(idvar = c("County"), timevar = "variable", direction = "wide") %>%
dplyr::arrange(desc(estimate.racewhite), .by_group = TRUE) %>%
dplyr::rename(black=estimate.raceblack,
white=estimate.racewhite,
asian=estimate.raceasian,
amerind=estimate.raceamerind,
hispanic=estimate.racehispanic,
hpislander=estimate.racehawaii,
other=estimate.raceother)
head_pa_race_long <- melt(head(pa_factors_race_wide, n=34), id.vars=c("County"),
measure.vars=c("white","black","asian","amerind","hispanic","hpislander","other"),
variable.name="race", value.name="percent")
tail_pa_race_long <- melt(tail(pa_factors_race_wide, n=34), id.vars=c("County"),
measure.vars=c("white","black","asian","amerind","hispanic","hpislander","other"),
variable.name="race", value.name="percent")
top10 <- head_pa_race_long %>%
filter(variable == "white")%>%
arrange(-desc(value)) %>%
bind_rows(head_pa_race_long %>% filter(variable != "white")) %>%
mutate(County = factor(County, unique(County))) %>%
ggplot(aes(x=County, y=value, fill=forcats::fct_rev(variable))) +
geom_bar(stat="identity", colour = "black") +
coord_flip() +
my_theme_bar() +
theme(axis.title.x=element_blank()) +
scale_y_continuous(name="% of population") +
labs(title="Fig. 11: Racial Composition of Pennsylvania by County",
caption = "") +
scale_fill_brewer(palette = "Set3", direction=-1) +
theme(legend.title=element_blank(),
legend.position="bottom",
axis.title.x=element_blank(),
axis.title.y=element_blank())
bottom10 <- tail_pa_race_long %>%
filter(variable == "white")%>%
arrange(-desc(value)) %>%
bind_rows(tail_pa_race_long %>% filter(variable != "white")) %>%
mutate(County = factor(County, unique(County))) %>%
ggplot(aes(x=County, y=value, fill=forcats::fct_rev(variable))) +
geom_bar(stat="identity", colour = "black") +
coord_flip() +
my_theme_bar() +
theme(axis.title.x=element_blank()) +
scale_y_continuous(name="% of population") +
labs(title="",
caption = "Source: US Census Bureau.") +
scale_fill_brewer(palette = "Set3", direction=-1) +
theme(legend.title=element_blank(),
legend.position="bottom",
axis.title.x=element_blank(),
axis.title.y=element_blank())
mylegend<-g_legend(top10) #shared legend
grid.arrange(arrangeGrob(top10 + theme(legend.position="none"),
bottom10 + theme(legend.position="none"),
nrow=1),
mylegend, nrow=2,heights=c(10, 1))# Insurance
pa_noins <- pa_factors %>% data.frame() %>%
filter(grepl("ins", variable, fixed = TRUE)) %>%
dplyr::select(NAME, variable, estimate) %>%
mutate(County = paste0(gsub(' County, Pennsylvania','',NAME))) %>%
filter(variable == "nohealthins")
pa_noins %>%
arrange(-desc(estimate)) %>%
mutate(County = factor(County, unique(County))) %>%
ggplot(aes(x=County, y=estimate)) +
geom_bar(stat="identity", color="black", fill="darkorange") +
geom_abline(slope=0, intercept=5.9, col = "darkgreen",lty=2) +
geom_abline(slope=0, intercept=8.6, col = "black",lty=2) +
scale_y_continuous(name="% of population") +
my_theme_bar() +
labs(title=paste0(strwrap("Fig. 12: Percentage of Population Without Health Insurance in Pennsylvania by County",width=50),collapse="\n"),
caption = "Surce: US Census Bureau") +
theme(legend.title=element_blank(),
legend.position="bottom",
axis.title.x=element_blank(),
axis.text.x = element_text(angle = 75, hjust=0.75))Next, I run multivariate analyses to see if there are any relations between the socioeconomic/demographic variables and the TEC data. As mentioned, I first convert both to zip code by using zcta.
## Obtaining ZIP-CODE TABULATION AREA LEVEL socioeconomic/demographic data
varying <- c("edurate","unemploymentrate","medincome","nohealthins","povertyrate","racehispanic","racewhite","raceblack")
zc_factors <- get_acs(geography = "zip code tabulation area",
variables = c(povertyrate = "DP03_0119P",
medincome = "DP03_0062",
edurate = "DP02_0067P",
unemploymentrate = "DP03_0009P",
nohealthins = "DP03_0099P",
racewhite = "DP05_0077P",
raceblack = "DP05_0078P",
ethhispanic = "DP05_0071P"),
state = 42, #PA is 42
year = 2019) %>%
dplyr::select(-moe) %>%
as.data.frame() %>%
reshape(idvar = c("GEOID","NAME"), timevar = "variable", direction = "wide")
## Obtaining ZIP-CODE TABULATION AREA LEVEL TEC data for hospital wait time (most populous field)
# Calculating assumption straight line 5 mile radius from each hospital
zc_tec_OP18b <- df_hosp_fac %>% filter(!is.na(OP_18b))
lat <- as.matrix(zc_tec_OP18b$Latitude)
long <- as.matrix(zc_tec_OP18b$Longitude)
# Since search_dataset takes forever to run, I first did this code, then saved down the dataset, and for subsequent runs and knits I pulled from the save down stataset
# zc_tec_OP18b2 <- zc_tec_OP18b %>%
# rowwise() %>%
# mutate(CTD = list(search_radius(Latitude,Longitude,radius=5))) %>%
# mutate(ziplist=list(unlist(CTD[[1]])))
# # Converting the lists of zipcodes into individual columns for each zip code
# zc_tec_OP18b3 <- zc_tec_OP18b2 %>%
# as.data.frame() %>%
# mutate(ziplist2 = as.character(ziplist))
#
# zc_tec_OP18b_final<- setDT(zc_tec_OP18b3)[, paste0("zipcode", 1:34) := tstrsplit(ziplist2, '", "')]
#
# zipnames <- zc_tec_OP18b_final %>%
# dplyr:: select(starts_with("zipcode"))
#
# zc_tec_OP18b_final <- zc_tec_OP18b_final %>%
# mutate_at(.vars = names(zipnames),
# .funs = gsub,
# pattern = "[^[:alnum:] ]",
# replacement = "\\") %>%
# mutate_at(.vars = names(zipnames),
# .funs = gsub,
# pattern = "c",
# replacement = "\\") %>%
# mutate_at(.vars=names(zipnames), .funs=as.character) %>%
# dplyr::select(Facility.Name, addr, Longitude, Latitude, OP_18b, names(zipnames))
#
# # Saving dataset because search_radius takes some time to run
# write.csv(zc_tec_OP18b_final, "zc_tec_OP18b_final.csv", row.names=F)
zc_tec_OP18b_final<- read.csv(file = "zc_tec_OP18b_final.csv")
# Wrangling from long to wide to use for regressions
forreg_zc_OP18b <- gather(zc_tec_OP18b_final, label, zipcode, zipcode1:zipcode34, factor_key=TRUE) %>%
dplyr::select(-label) %>%
arrange(zipcode) %>%
na.omit %>%
filter(!zipcode=="harater0") %>%
#mutate(zipcode2=ifelse(as.numeric(forreg_zc_OP18b$zipcode)<10000, paste0("0", forreg_zc_OP18b$zipcode), forreg_zc_OP18b$zipcode)) %>%
filter(!as.numeric(zipcode)>40000) %>% #dropping ohio zipcodes
filter(!as.numeric(zipcode)<10000) #dropping new jersey zipcodes
# Calculating mean wait time for zipcodes serviced by more than one hospital
forreg_zc_mean <- forreg_zc_OP18b %>%
group_by(zipcode) %>%
summarise_at(vars(OP_18b), list(OP_18b_mean = mean))
# Lookup zcta for zipcodes using crosswalk dataset (zcta package)[https://github.com/jjchern/zcta]
zip_data <- zcta::zipzcta %>% filter(state=="PA")
zip_more <- zipcodeR::zip_code_db %>% filter(state=="PA")
forreg_zc <- forreg_zc_mean %>%
left_join(zip_data, by = c("zipcode" = "zip")) %>%
left_join(zc_factors, by=c("zcta" = "GEOID")) %>%
left_join(zip_more, by=c("zipcode" = "zipcode"))
head(forreg_zc)## # A tibble: 6 × 38
## zipcode OP_18b_mean po_name state.x zip_type zcta NAME estimate.edurate estimate.unempl… estimate.medinc… estimate.noheal… estimate.povert… estimate.ethhis… estimate.racewh… estimate.racebl… zipcode_type major_city post_office_city common_city_list county state.y lat lng timezone radius_in_miles area_code_list population population_dens… land_area_in_sq… water_area_in_s… housing_units
## <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> <blob> <chr> <chr> <dbl> <dbl> <chr> <dbl> <blob> <int> <dbl> <dbl> <dbl> <int>
## 1 15003 135 Ambridge PA ZIP Cod… 15003 ZCTA… 93.2 4.1 50189 3.9 6.4 2.8 82.6 10.6 Standard Ambridge Ambridge, PA <raw 33 B> Beave… PA 40.6 -80.2 Eastern 3 <raw 15 B> 11861 1816 6.53 0.37 6199
## 2 15009 163 Beaver PA ZIP Cod… 15009 ZCTA… 95.6 4.2 63738 2.1 6.4 2.2 94.1 1.5 Standard Beaver Beaver, PA <raw 48 B> Beave… PA 40.7 -80.4 Eastern 6 <raw 22 B> 15082 615 24.5 1.4 6839
## 3 15014 182 Brackenridge PA ZIP Cod… 15014 ZCTA… 93.2 5.8 42500 6 14.3 1 94.5 3.8 Standard Brackenri… Brackenridge, PA <raw 24 B> Alleg… PA 40.6 -79.7 Eastern 0.625 <raw 15 B> 3184 6264 0.51 0.04 1617
## 4 15017 159 Bridgeville PA ZIP Cod… 15017 ZCTA… 94.4 3.6 67058 3.2 4.5 1.1 91.3 2.1 Standard Bridgevil… Bridgeville, PA <raw 23 B> Alleg… PA 40.4 -80.1 Eastern 4 <raw 15 B> 16213 1244 13.0 0.05 7875
## 5 15020 175 Bunola PA Post Of… 15020 ZCTA… 89.1 0 NA 13.9 0 0 100 0 PO Box Bunola Bunola, PA <raw 18 B> Alleg… PA 40.2 -80.0 Eastern 0.511 <raw 15 B> 231 175 1.32 0.46 121
## 6 15022 175 Charleroi PA ZIP Cod… 15022 ZCTA… 93.6 5.8 48051 3.3 10.7 1.7 90.5 3.5 Standard Charleroi Charleroi, PA <raw 34 B> Washi… PA 40.1 -79.9 Eastern 5 <raw 15 B> 10340 573 18.1 0.24 5391
## # … with 7 more variables: occupied_housing_units <int>, median_home_value <int>, median_household_income <int>, bounds_west <dbl>, bounds_east <dbl>, bounds_north <dbl>, bounds_south <dbl>
# Linear Regression (lm and glm produce similar results)
regout <- lm(data = forreg_zc, OP_18b_mean ~ population + population_density +
estimate.edurate + estimate.unemploymentrate + estimate.povertyrate + estimate.medincome +
estimate.nohealthins +
estimate.racewhite + estimate.raceblack + estimate.ethhispanic)
stargazer::stargazer(regout)##
## % Table created by stargazer v.5.2.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
## % Date and time: Tue, Dec 07, 2021 - 21:18:18
## \begin{table}[!htbp] \centering
## \caption{}
## \label{}
## \begin{tabular}{@{\extracolsep{5pt}}lc}
## \\[-1.8ex]\hline
## \hline \\[-1.8ex]
## & \multicolumn{1}{c}{\textit{Dependent variable:}} \\
## \cline{2-2}
## \\[-1.8ex] & OP\_18b\_mean \\
## \hline \\[-1.8ex]
## population & 0.0003$^{***}$ \\
## & (0.0001) \\
## & \\
## population\_density & 0.001$^{**}$ \\
## & (0.0004) \\
## & \\
## estimate.edurate & 0.377 \\
## & (0.338) \\
## & \\
## estimate.unemploymentrate & $-$0.437 \\
## & (0.462) \\
## & \\
## estimate.povertyrate & $-$0.132 \\
## & (0.249) \\
## & \\
## estimate.medincome & 0.0001 \\
## & (0.0001) \\
## & \\
## estimate.nohealthins & $-$0.842$^{**}$ \\
## & (0.373) \\
## & \\
## estimate.racewhite & $-$1.466$^{***}$ \\
## & (0.340) \\
## & \\
## estimate.raceblack & $-$1.294$^{***}$ \\
## & (0.355) \\
## & \\
## estimate.ethhispanic & $-$0.520 \\
## & (0.374) \\
## & \\
## Constant & 260.758$^{***}$ \\
## & (48.144) \\
## & \\
## \hline \\[-1.8ex]
## Observations & 525 \\
## R$^{2}$ & 0.228 \\
## Adjusted R$^{2}$ & 0.213 \\
## Residual Std. Error & 30.514 (df = 514) \\
## F Statistic & 15.216$^{***}$ (df = 10; 514) \\
## \hline
## \hline \\[-1.8ex]
## \textit{Note:} & \multicolumn{1}{r}{$^{*}$p$<$0.1; $^{**}$p$<$0.05; $^{***}$p$<$0.01} \\
## \end{tabular}
## \end{table}
As shown in the above maps, patients in Pennsylvania, on average, spend more time waiting in the ED than the national average (Figures 1 and 2). Additionally, patients who go to the ED with signs of a possible heart attack or stroke, on average, were received timely care less often (Figures 3 and 4). Patients with chest pain in PA had to wait longer before transfer to a specialty hospital than the US national average (Figure 5).
Though PA underperforms nationally, the regional graphs shows that PA performs the second best in the Northeast region (behind Vermont) with respect to general ED wait time (Fig. 6), but on par with the region for timely care for acute life-threatening symptoms (chest pain, stroke symptoms, Fig. 7). problem solving.
table 1-3 table 4
As expected, the largest number of hospitals are in the most densely populated counties (Fig. 8). fig 9, figs 10
Philadelphia, is the most populous and racially and ethnically diverse county, and has a greater proportion of aging individuals living in poverty and with poor access to health care.[AARP]
I would like to thank the following advisors, provided invaluable advice and guidance for data sources, mapping the data and assumptions for the regression analysis:
+ Dr. Gary Weissman, MD, MSHP (Assistant Professor of Medicine and Senior Fellow Leonard Davies Institute)
+ Dr. John Holmes, PhD, FACE, FACMI (Professor of Medical Informatics in Epidemiology, Associate Director of the Institute for Biomedical Informatics)
I would like to thank my classmates in BMIN 503 for providing inspiration as well as valuable peer-peer
sessionInfo()## R version 4.1.1 (2021-08-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] stargazer_5.2.2 zcta_0.0.1 zipcodeR_0.3.3 scales_1.1.1 tidycensus_1.1 ggsflabel_0.0.1 DT_0.20 tigris_1.5 ggmap_3.0.0 choroplethrMaps_1.0.1 choroplethr_3.7.0 acs_2.1.4 XML_3.99-0.8 tmap_3.3-2 spDataLarge_2.0.1 spData_2.0.1 raster_3.5-2 cowplot_1.1.1
## [19] mapproj_1.2.7 maps_3.4.0 RColorBrewer_1.1-2 viridis_0.6.2 viridisLite_0.4.0 rgdal_1.5-27 sp_1.4-6 sf_1.0-4 ggsn_0.5.0 leaflet_2.0.4.1 mapview_2.10.0 modelsummary_0.9.4 janitor_2.1.0 gridExtra_2.3 lubridate_1.8.0 survey_4.1-1 Matrix_1.3-4 ggpubr_0.4.0
## [37] Hmisc_4.6-0 Formula_1.2-4 survival_3.2-13 lattice_0.20-45 rgenoud_5.8-3.0 broom_0.7.10 kableExtra_1.3.4 reshape_0.8.8 data.table_1.14.2 tableone_0.13.0 plyr_1.8.6 readxl_1.3.1 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.1.0 tidyr_1.1.4
## [55] tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 rmarkdown_2.11 devtools_2.4.2 usethis_2.1.3 knitr_1.36
##
## loaded via a namespace (and not attached):
## [1] pacman_0.5.1 utf8_1.2.2 tidyselect_1.1.1 RSQLite_2.2.8 htmlwidgets_1.5.4 maptools_1.1-2 munsell_0.5.0 codetools_0.2-18 units_0.7-2 withr_2.4.2 colorspace_2.0-2 tables_0.9.6 highr_0.9 uuid_1.0-3 rstudioapi_0.13 stats4_4.1.1
## [17] wk_0.5.0 ggsignif_0.6.3 labeling_0.4.2 RgoogleMaps_1.4.5.3 WDI_2.7.4 farver_2.1.0 bit64_4.0.5 rprojroot_2.0.2 vctrs_0.3.8 generics_0.1.1 xfun_0.28 R6_2.5.1 RJSONIO_1.3-1.6 bitops_1.0-7 cachem_1.0.6 assertthat_0.2.1
## [33] nnet_7.3-16 gtable_0.3.0 lwgeom_0.2-8 leafpop_0.1.0 processx_3.5.2 rlang_0.4.12 systemfonts_1.0.2 splines_4.1.1 rstatix_0.7.0 dichromat_2.0-0 brew_1.0-6 checkmate_2.0.0 s2_1.0.7 yaml_2.2.1 abind_1.4-5 modelr_0.1.8
## [49] crosstalk_1.2.0 backports_1.3.0 tools_4.1.1 ellipsis_0.3.2 jquerylib_0.1.4 proxy_0.4-26 sessioninfo_1.2.1 Rcpp_1.0.7 base64enc_0.1-3 classInt_0.4-3 ps_1.6.0 prettyunits_1.1.1 rpart_4.1-15 tmaptools_3.1-1 ggrepel_0.9.1 haven_2.4.3
## [65] cluster_2.1.2 fs_1.5.0 leafem_0.1.6 magrittr_2.0.1 reprex_2.0.1 pkgload_1.2.3 hms_1.1.1 evaluate_0.14 jpeg_0.1-9 testthat_3.1.0 compiler_4.1.1 KernSmooth_2.23-20 crayon_1.4.2 htmltools_0.5.2 tzdb_0.2.0 DBI_1.1.1
## [81] dbplyr_2.1.1 rappdirs_0.3.3 car_3.0-12 cli_3.1.0 mitools_2.4 parallel_4.1.1 pkgconfig_2.0.3 foreign_0.8-81 terra_1.4-11 xml2_1.3.2 svglite_2.0.0 bslib_0.3.1 webshot_0.5.2 rematch_1.0.1 rvest_1.0.2 snakecase_0.11.0
## [97] callr_3.7.0 digest_0.6.28 cellranger_1.1.0 leafsync_0.1.0 htmlTable_2.3.0 curl_4.3.2 satellite_1.0.4 rjson_0.2.20 lifecycle_1.0.1 jsonlite_1.7.2 carData_3.0-4 desc_1.4.0 fansi_0.5.0 pillar_1.6.4 fastmap_1.1.0 httr_1.4.2
## [113] pkgbuild_1.2.0 glue_1.5.0 remotes_2.4.1 png_0.1-7 leaflet.providers_1.9.0 bit_4.0.4 class_7.3-19 stringi_1.7.5 sass_0.4.0 blob_1.2.2 latticeExtra_0.6-29 stars_0.5-4 memoise_2.0.0 e1071_1.7-9